- Goals
- Simple Neural Networks
- Deep Neural Networks
- Towards real data applications
How to minimize these loss functions? We use gradient descent (via back-propagation) to find \(\widehat{\mathbf{w}}\)!
For those who are interested: stochastic gradient descent, mini-batches, and the Adam algorithm.
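As a minimal illustration of the idea (not code from the course), here is a plain gradient-descent loop for a least-squares loss; the toy data, learning rate, and number of steps are arbitrary choices for the example.

```python
import numpy as np

# Toy data: fit y = X w with a squared-error loss (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)          # initial guess for the weights
learning_rate = 0.1

for step in range(200):
    residual = X @ w - y                  # prediction error
    grad = 2 * X.T @ residual / len(y)    # gradient of the mean squared error
    w -= learning_rate * grad             # gradient-descent update

print(w)  # should end up close to true_w
```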
Coding complex neural networks by hand can be challenging; thankfully, there are existing frameworks that do most of the work for us. Check out https://www.tensorflow.org/
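For example, a small fully connected network can be specified in a few lines with Keras; the input dimension, layer widths, and number of classes below are placeholder choices, not the ones used in class.

```python
import tensorflow as tf

# A small fully connected network for 10-class classification
# (input dimension and layer widths are illustrative choices).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Adam is a popular variant of stochastic gradient descent.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(x_train, y_train, epochs=5, batch_size=32)  # with your own data
```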
We slide the orange matrix over our original image (green) one pixel at a time (this step size is called the stride) and, for every position, we compute an element-wise multiplication and sum the results to get the corresponding element of the output matrix (pink).
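A minimal NumPy sketch of that sliding-window computation with stride 1; the 5x5 image and 3x3 filter values are arbitrary and only stand in for the green and orange matrices.

```python
import numpy as np

image = np.arange(25).reshape(5, 5)       # the "green" input image (toy values)
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])            # the "orange" 3x3 filter

h = image.shape[0] - kernel.shape[0] + 1  # output height with stride 1
w = image.shape[1] - kernel.shape[1] + 1  # output width with stride 1
feature_map = np.zeros((h, w))

for i in range(h):
    for j in range(w):
        patch = image[i:i + 3, j:j + 3]              # current 3x3 window
        feature_map[i, j] = np.sum(patch * kernel)   # element-wise multiply, then sum

print(feature_map)                        # the "pink" output matrix
```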
Apply ReLU (element-wise) to the feature map to introduce non-linearity.
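In NumPy this is a one-liner; the feature-map values below are made up for the example.

```python
import numpy as np

# ReLU keeps positive entries and sets negative entries to zero, element-wise.
feature_map = np.array([[ 3.0, -1.5],
                        [-0.2,  4.0]])    # toy feature-map values
activated = np.maximum(feature_map, 0)    # [[3.0, 0.0], [0.0, 4.0]]
```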
Max pooling progressively reduces the spatial size of each feature map while keeping the most important information. It reduces the number of parameters and the amount of computation in the network, and hence also helps control overfitting.
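A small sketch of 2x2 max pooling with stride 2 on a toy 4x4 feature map (the values are illustrative).

```python
import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 0],
                        [1, 2, 8, 7],
                        [3, 0, 4, 2]])    # toy 4x4 feature map

pool = 2                                  # 2x2 window, stride 2
pooled = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        window = feature_map[i * pool:(i + 1) * pool, j * pool:(j + 1) * pool]
        pooled[i, j] = window.max()       # keep only the largest activation

print(pooled)  # [[6. 4.]
               #  [3. 8.]]
```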
The output from the convolutional and pooling layers represents high-level features of the input image. The purpose of the Fully Connected layer is to use these features to classify the input image into the various classes of the training dataset.
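Putting the pieces together, a small convolutional classifier in Keras might look like the following sketch; the image size, number of filters, and number of classes are placeholder values, not those of any dataset used in class.

```python
import tensorflow as tf

# Convolution + ReLU -> max pooling -> flatten -> fully connected classifier.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                   # placeholder image size
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),                            # feature maps -> vector
    tf.keras.layers.Dense(64, activation="relu"),         # fully connected layer
    tf.keras.layers.Dense(10, activation="softmax"),      # class probabilities
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```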
Please fill out the anonymous electronic course evaluation. Feel free to leave your feedback; the course can always improve thanks to students' input!